Microbial metabolism in aromatic-contaminated environments has important ecological implications, and 
obtaining a complete understanding of this process remains a relevant goal. To understand the roles of 
biodiversity and aromatic-mediated genetic and metabolic rearrangements, we conducted 'OMIC 
investigations in an anthropogenically influenced and polyaromatic hydrocarbon (PAH)-contaminated 
soil with (Nbs) or without (N) bio-stimulation with calcium ammonia nitrate, NH4NO3 and KH2PO4 and the
commercial surfactant Iveysol, plus two naphthalene-enriched communities derived from both soils 
(CN2 and CN1, respectively). Using a metagenomic approach, a total of 52, 53, 14 and 12 distinct 
species (according to operational phylogenetic units (OPU) in our work equivalent to taxonomic 
species) were identified in the N, Nbs, CN1 and CN2 communities, respectively. Approximately 10 
out of 95 distinct species and 238 out of 3293 clusters of orthologous groups (COGs) protein 
families identified were clearly stimulated under the assayed conditions, whereas only two species 
and 1465 COGs conformed to the common set in all of the mesocosms. Results indicated distinct 
biodegradation capabilities for the utilisation of potential growth-supporting aromatics, which results 
in bio-stimulated communities being extremely fit to naphthalene utilisation and non-stimulated 
communities exhibiting a greater metabolic window than previously predicted. On the basis of 
comparing protein expression profiles and metagenome data sets, inter-alia interactions among 
members were hypothesised. Curated databases are discussed and used for the first
time to reconstruct 'presumptive' degradation networks for complex microbial communities.
Introduction

Polyaromatic hydrocarbons (PAHs) are widely distributed in the environment owing to their
abundance in crude oil and their widespread use in chemical manufacturing (Kastner, 2000).
PAHs are pollutants of great concern owing to their toxicity, mutagenicity and
carcinogenicity. A number of microorganisms, via the action of Rieske non-haem iron
oxygenases, have the ability to grow on PAHs as a sole carbon and energy source
(Beil et al., 1998; Roling et al., 2003; Witzig et al., 2006; Peng et al., 2008).

The majority of efforts aimed at understanding microbial responses to aromatic compounds
have focused on genomic (Jimenez et al., 2002; Nogales et al., 2008; Puchalka et al., 2008),
transcriptomic (Domínguez-Cuevas et al., 2006; Yuste et al., 2006) and proteomic
(Santos et al., 2004; Segura et al., 2005; Kim et al., 2006; Kurbatov et al., 2006;
Tomas-Gallardo et al., 2006) analyses conducted in pure cultures. Among the strains that
mineralise PAHs (including naphthalene, fluorene, phenanthrene, pyrene and dibenzofuran,
to cite some), a number of bacteria have received special attention. These include members
of the common genera Pseudomonas, Sphingomonas, Cycloclasticus, Burkholderia, Rhodococcus
and Polaromonas; the novel genera Neptunomonas and Janibacter; thermophilic bacteria of the
genera Nocardia and Bacillus; anoxic bacteria of the Deltaproteobacteria and Alcaligenes;
and high molecular weight PAH-degrading bacteria of the genera Mycobacterium,
Stenotrophomonas and Pasteurella. A list of known pure bacterial cultures capable of
degrading PAHs can be found in a recent review by Lu et al. (2011).

However, a large number of culture-independent techniques have shown that
pollutant-degrading organisms enriched in the laboratory often do not have an important
role in the in situ biodegradation of pollutants and that the diversity of
pollutant-degrading organisms in polluted environmental samples is much greater than the
diversity determined from cultivation (Abulencia et al., 2006; Liu et al., 2009;
Boronin and Kosheleva, 2010; Yagi et al., 2010; Liang et al., 2011). Under this scenario,
it is relevant to use molecular microbial tools to define key catabolic players at
contaminated sites, to predict pollutant degradation networks in the environment and to
suggest methods for rational intervention associated with the implementation of
bioremediation (Vilchez-Vargas et al., 2010). However, the number of integrative 'omic'
investigations that have been carried out in PAH-associated microbial communities is
limited (Kweon et al., 2007; Powell et al., 2008; Selesi et al., 2010) because the
available genomic information and curated databases are incomplete
(Perez-Pantoja et al., 2009, 2012).

Taking all of the above information into consideration, we performed a thorough and
holistic (ecosystems biology) phylogenetic, functional and proteomic analysis of the key
players in a sample of strongly anthropogenically influenced, PAH-contaminated soil (N)
and a naphthalene-enriched community derived from this soil (CN1). The study also included
samples of the same soil bio-stimulated with calcium ammonia nitrate, NH4NO3 and KH2PO4,
and the commercial surfactant Iveysol (Ivey International Inc., Campbell River, BC, Canada)
(herein referred to as Nbs), and the naphthalene-enriched community derived from it (CN2).
In all cases, the biodegradation networks of the respective whole communities were
reconstructed. This study clarified the genomic and proteomic basis of microbial
biodiversity, ecology and function in response to both PAHs (represented by naphthalene)
and bio-stimulation. It should be noted that our metaproteomic investigations were
restricted to the naphthalene-enriched communities because of their lower complexity as
compared with the soil samples, as well as the larger assembled sequences obtained through
direct pyro-sequencing for both samples.

Materials and methods 

General methods and 'OMIC data analysis and 
processing 

Full descriptions of the materials and methods used 
for soil characterisation; hydrocarbon analysis; DNA 
extraction; construction of 16S rRNA gene clone
libraries, sequencing and phylogenetic analysis; 
denaturing gradient gel electrophoresis analysis; 
metagenomic setup and sequencing, assembly and 
gene prediction; metaproteomic setup and protein 
extraction, separation and identification and data 
processing are available in the Supplementary 
Materials and methods. 

Soil sample collection and preparation of bio-stimulated soil and enrichment cultures
Soil samples (herein referred to as N) were obtained 
from a parcel on the northern Iberian Peninsula 
contiguous to a chemical plant (Lugones, Oviedo, 
Spain; 40°18′33″N, B^^l^W, at an altitude of
300 m). The chemical plant (Supplementary Figure
S1) was used for several decades for the production
of naphthalene, phenols and other compounds from 
coal tar processing as well as for manufacturing 
resins. A considerable amount of other chemical 
products (pesticides, solvents, etc.) was stored, 
although probably not manufactured, in the plant. 
In 1989 the plant was closed and then used for years 
to store chemical waste, particularly polychlori- 
nated biphenyls, coolants and other unspecified 
products. In 2006 most of the buildings were demolished and the characterisation of soil
contamination started. Thus, a significant amount of the PAH detected were conceivably
formed by the enriched bacterial populations present in the soil through a natural
attenuation effect. The top soil was sampled at a depth of 0-30 cm in February 2007
(soil temperature 18 °C). Three replicates (1 kg each) were collected within a 1 m
distance, and the samples were kept in open plastic bags in the dark at 4 °C. Vegetation
and other non-soil materials, including cobbles, were removed prior to homogenising the
samples. Immediately after collection, the soil samples were sieved (2-mm mesh size),
representative subsamples of the triplicate samples were mixed, and 100 g aliquots of
sample N were used for chemical analysis. In addition, 10 g aliquots were stored at
-20 °C in sterile flasks for DNA- and proteome-based analyses.

Bioremediation was performed over approximately 900 m³ of the contaminated soil (N), where
9 tons of dehydrated calcium ammonia nitrate and 3 tons of a mixture of dehydrated NH4NO3
and KH2PO4 combined with 4500 l of the commercial surfactant Iveysol were applied to the
homogenised soil. The C:N:P ratio that was applied was 100:10:1, as recommended for
bioremediation purposes (Gallego et al., 2007a). The treatment was performed over 231 days,
and the biopile was routinely watered and tilled to maintain humidity (15-20%) and
adequate oxygenation. Samples of the top bio-stimulated soil, herein named Nbs, were taken
and processed using the same protocols as described above for the original soil.

Enrichment cultures were performed in Bushnell Haas (Sigma Chemical Co., St Louis, MO, USA)
mineral medium that contained naphthalene at a concentration of 0.1% (w/v) as the sole
carbon source as described previously (Gallego et al., 2007b). The composition of the
medium was as follows: MgSO4·H2O (0.20 g l⁻¹), CaCl2·2H2O (0.02 g l⁻¹), KH2PO4 (1.0 g l⁻¹),
K2HPO4 (1.0 g l⁻¹), NH4NO3 (1.0 g l⁻¹) and FeCl3 (0.05 g l⁻¹). Two different inocula were
used. The CN1 enrichment culture was obtained by inoculating 1 g of the polluted soil (N)
into a flask containing 100 ml of the culture medium; the CN2 culture was inoculated with
1 g of the same soil that had been subjected to a massive bioremediation (bio-stimulation)
process (Nbs). The enrichment cultures were incubated at 30 °C and 250 r.p.m., and
0.1% (v/v) of the culture was transferred to fresh medium each week.



Results and discussion 

Sample characteristics 

An agronomic analysis of the soil N revealed a loamy clay soil with a pH of 8.2 and a
conductivity of 0.13 dS m⁻¹, containing low amounts of the typically predominant ions
(calcium, magnesium, potassium and sodium); the detected natural organic matter, nitrogen
and phosphorus levels in the soil (Supplementary Table S1) are characteristic of infertile
soils. Gas chromatography-mass spectrometry (GC-MS) was used to quantify the levels of the
16 EPA (Environmental Protection Agency) priority PAHs present in the soil
(Supplementary Figure S2). In total, the soil exhibited an average concentration of
805 mg PAH per kg (Table 1), which is within the range of, or slightly lower than, the
levels found in previously reported PAH-contaminated soils (that is, 1667 mg kg⁻¹ in
Ni Chadhain et al. (2006); 589 mg kg⁻¹ in Richardson et al. (2012); 335-8645 mg kg⁻¹ in
Thavamani et al. (2012); 3000 mg kg⁻¹ in Lors et al. (2012)). The bio-stimulated soil
exhibited a total concentration of 221.6 mg kg⁻¹ (average) of the 16 EPA priority PAHs;
the quantified compounds are listed in Table 1. The degradation rate of naphthalene in N
was too low to be established reliably, owing to the long time elapsed in the presence of the



Table 1 Level of 16 EPA priority PAHs present in the original (N) and bio-stimulated (Nbs) soils

PAH                         Concentration (mg kg⁻¹)
                            Soil N        Soil Nbs
Naphthalene                 607           174
Acenaphthylene              1.60          0.40
Acenaphthene                16.60         4.40
Fluorene                    20.80         5.96
Anthracene                  21.20         4.99
Phenanthrene                46.40         24.23
Fluoranthene                30.40         4.73
Pyrene                      19.40         4.02
Benzo(a)anthracene          12.00         2.83
Chrysene                    12.50         2.64
Benzo(b)fluoranthene        4.40          1.34
Benzo(k)fluoranthene        3.10          0.90
Benzo(a)pyrene              4.90          1.27
Indeno(1,2,3-cd)pyrene      3.40          1.58
Benzo(g,h,i)perylene        1.50          0.82
Dibenzo(a,h)anthracene      0.79          0.18

Abbreviations: N, non-bio-stimulated polyaromatic hydrocarbon-contaminated soil; Nbs,
bio-stimulated polyaromatic hydrocarbon-contaminated soil; PAH, polyaromatic hydrocarbon.
Gas chromatography-mass spectrometry (GC-MS) was used for quantification.



contaminants; however, once the soil had been subjected to bio-stimulation, the
degradation rate of naphthalene in soil from day 0 to day 231 was calculated as
1.818 mg kg⁻¹ soil per day.

Denaturing gradient gel electrophoresis was used to survey the development of the
microbial communities in the enrichment cultures and to deduce the time point at which a
stable community was obtained. We observed that the denaturing gradient gel
electrophoresis profiles for the CN1 enrichment cultures changed between transfers 45 and
60 (Supplementary Figure S3); changes in the profiles were also observed for CN2 after 35
and 60 transfers. These changes were subsequently maintained, and samples subjected to at
least 60 transfers were used for further investigations. The major bands were excised and
sequenced, and their affiliation revealed the presence of microorganisms identified as
members of the genera Achromobacter, Flavobacterium and Acidovorax in both enrichments,
and of Pseudomonas, Microbacterium, Lysobacter and an endosymbiont of Acanthamoeba in CN2.
Results indicated that consortia CN1 and CN2 were able to degrade 97% of the naphthalene
at an initial concentration of 1000 p.p.m. within 72 and 80 h, respectively.



Bacterial diversity and composition blueprints
DNA isolated from each investigated microbial community was employed for a PCR-based 16S
ribosomal DNA (rDNA) gene diversity survey of the community structures in the samples. For
this purpose, clone libraries were constructed as described in the Supplementary Materials
and methods, and the clones were fully sequenced to obtain the almost complete cloned 16S
rDNA gene sequence. Additionally, because of the low yields in the clone libraries of Nbs,
we used the partial 16S rDNA gene sequences obtained in the metagenome survey, retaining
only those sequences with a length of > 600 nucleotides; all shorter sequences were
discarded. A total of 670 sequences (that is, N: 212; Nbs: 261 (86 clones + 175 partial
454 sequences); CN1: 90; CN2: 107) were obtained and analysed. The overall phylogenetic
composition in the libraries is shown in Figures 1-3, in which all operational
phylogenetic units (OPUs) affiliated with twelve phyla of the domain Bacteria, namely
Proteobacteria followed, to a much lower extent, by Bacteroidetes, Verrucomicrobia,
Firmicutes, Actinobacteria, Chloroflexi, Cyanobacteria, Planctomycetes, Spirochaetes and
the candidate phyla OP8, TM7 and WCHB1 (Figure 4).
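
The length criterion applied to the partial 454 sequences above can be sketched as a simple filter. This is a minimal illustration only, assuming the reads are already held in memory as a name-to-sequence mapping; the function name `filter_by_length` and the example read names are hypothetical and are not part of the original pipeline.

```python
def filter_by_length(records, min_len=600):
    """Keep only sequences strictly longer than min_len nucleotides."""
    return {name: seq for name, seq in records.items() if len(seq) > min_len}


# Hypothetical reads: only the 601-nt read satisfies the "> 600 nt" criterion.
reads = {
    "read_a": "A" * 601,
    "read_b": "C" * 600,
    "read_c": "G" * 250,
}
kept = filter_by_length(reads)
print(sorted(kept))
```

The comparison is strict (`>`), so that a read of exactly 600 nucleotides is discarded, as the text specifies.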

The clones were classified into 124 bacterial operational phylogenetic units (OPUs)
(Lopez-Lopez et al., 2010) (Table 2 and Supplementary Figure S2). The values of the
Shannon-Wiener and Good's coverage indexes indicated that communities N and Nbs were much
more complex (51 OPUs and 53 OPUs were detected, respectively) than the CN1 and CN2
(13 OPUs and 12 OPUs, respectively) communities, which were dominated by few microbial
species or strains and exhibited a rather simple structure (Figures 1-3). It should be
recalled that each of the distinct OPUs detected was identified as a putative single
species owing to the high sequence identity (Yarza et al., 2008). However, it should also
be noted that the sequencing survey covered over 75% of the expected OPU diversity in all
cases, as can be deduced from the high Good's coverage in all samples (Table 2) and the
rarefaction curves indicating closeness to saturation (Supplementary Figure S4). A large
proportion of the obtained sequences affiliated with known families and clustered with
branches represented by cultured microorganisms that have previously been found in
contaminated environments (Figures 1-3 and Supplementary Figures S5 and S6), and thus,
they do not appear to be specific to the soil and the
enrichment samples that were investigated here.
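
As a worked illustration of the indices reported in Table 2, the sketch below computes the Shannon-Wiener index, Dominance_D and Good's coverage from a list of per-OPU sequence counts, following the formulas given in the Table 2 footnote. The study itself used the PAST software; the function name `diversity_indices` and the example counts are hypothetical and do not correspond to any sample analysed here.

```python
import math


def diversity_indices(counts):
    """Return (Shannon H, Dominance D, Good's coverage C) for per-OPU counts."""
    n_t = sum(counts)                       # total number of sequences
    props = [n / n_t for n in counts if n > 0]
    shannon = -sum(p * math.log(p) for p in props)
    dominance = sum(p * p for p in props)
    singletons = sum(1 for n in counts if n == 1)  # OPUs observed exactly once
    coverage = 1 - singletons / n_t
    return shannon, dominance, coverage


# Hypothetical community: three OPUs with 1, 1 and 2 sequences.
h, d, c = diversity_indices([1, 1, 2])
print(round(h, 4), d, c)
```

A community dominated by a few OPUs gives a low H and a high D, which is the pattern described for the enrichment cultures relative to the soils.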



Figure 1 A neighbour-joining tree of the proteobacterial 16S rDNA gene sequences
originating from the N, Nbs, CN1 and CN2 communities. For the reconstruction, only the
almost complete sequences of the clones from the four samples were used. In addition,
partial 454 sequences of >600 nucleotides were inserted in the tree using the Parsimony
tool implemented in ARB. The number of sequences comprised in each identity cluster (OPU)
is specified for each sample. The colour code for the 16S rDNA sequences is as follows:
N (green), Nbs (purple), CN1 (blue), CN2 (red). When an OPU contains sequences from more
than one sample, the OPU denomination is written in black followed by the colour code and
sequence number of each studied sample.



In any case, although soil characteristics and the specific pollutant composition may
differ, the biodiversity found in the original soils (N and
Figure 2 A subset of the tree shown in Figure 1, where the genealogical composition of the members of the family Pseudomonadaceae is
shown. Layout, reconstruction and colour codes used are the same as in Figure 1.



Nbs) (Shannon indexes of 2.81 and 3.32, respectively) is in the range of that found in
other PAH-contaminated soils. For example, Shannon indexes of 1.74-2.78
(Vivas et al., 2008), 2.38-2.96 (Thavamani et al., 2012) and 4.87-5.01
(Martin et al., 2012) were observed in soils that had been historically contaminated
with PAHs.

The dominant group among the Gammaproteobacteria consisted of members of the genus
Pseudomonas (Figures 2 and 4). The N sequences were distributed rather evenly within this
genus, with approximately 40% of the total gammaproteobacterial sequences branching in
the P. fluorescens, P. frederiksbergensis and P. amygdali lineages, followed by those
affiliated with the Pseudoxanthomonas spp. (20%) and P. putida (14%) lineages
(Supplementary Table S2). A similarly even distribution was observed in the sample Nbs,
in which members of the Pseudomonadaceae were also the most abundant. However, in this
case, the Pseudomonas species composition shifted towards the major representatives
P. tuomuerensis, P. pertucinogena, P. stutzeri genomovar 1 and as yet undescribed
Pseudomonas species. These results clearly contrast with the distribution of the CN2
sequences, where only representatives of P. stutzeri (46%) lineages
(Supplementary Table S2) were observed, in accordance with this being one of the major
lineages represented in Nbs. Finally, in CN1, no representation of the Pseudomonas genus
was found, but six sequences representing Pseudoxanthomonas were identified (6% of the
total) (Figures 1 and 4 and Supplementary Table S2); this is in agreement with previous
observations made in PAH microcosm experiments that suggest that, although Pseudomonas
spp. can be easily enriched and use a variety of organic substrates, they may be
repressed by other Proteobacteria (Rhee et al., 2004).

The alphaproteobacterial sequences detected in 
N were widely distributed among Rhizobium, 
Devosia, Novosphingobium, Sphingomonas and 
Brevundimonas, which were the most represented 

Figure 3 A neighbour-joining tree of non-proteobacterial 16S
rDNA gene sequences from the clone libraries established from N,
Nbs, CN1 and CN2 community DNA. Layout, reconstruction and
colour codes used are the same as in Figure 1.



genera (Figure 1 and Supplementary Table S2). In contrast, members of this lineage were
barely represented in Nbs, with just seven sequences, of which Brevundimonas was the most
represented. None of these genera were detected in any of the enrichment cultures
(Figure 1). Most of the alphaproteobacterial clones in CN1 (73%) were instead affiliated
with Azospirillum species, that is, Azospirillum oryzae, whereas only three
alphaproteobacterial sequences (out of a total of 108) were found in CN2, and only one
was affiliated with Azospirillum (Figure 5 and Supplementary Table S2).

Finally, the two N clones (out of 214) representing Betaproteobacteria affiliated with
Acidovorax and Achromobacter spp. (Supplementary Table S2). By contrast, Nbs sequences
were well represented in this lineage, with 27 sequences (10%) affiliated with
Tetrathiobacter and Acidovorax. The percentage of Betaproteobacteria found in the
enrichment cultures was much higher (Figures 1 and 4); we observed an overrepresentation
of sequences affiliated with the genera Comamonas and Achromobacter in CN1 and
Achromobacter in CN2 (Figures 1 and 5). In addition to Delta- and Epsilon-proteobacteria,
members of Tenericutes, Verrucomicrobia, Firmicutes and Cyanobacteria were only detected
in samples N and Nbs (Figures 3 and 4).

The above data demonstrated that the CN1 and CN2 communities displayed considerably
different phylogenetic structures (Figures 4 and 5), sharing only two OPUs: Beta- and
Alpha-proteobacteria representatives of the denitrifying Achromobacter (that is,
Achromobacter spanius) and Azospirillum (that is, Azospirillum doebereinerae) genera,
respectively. Whereas the first was most abundant in CN2 (29 vs 7 sequences), the second
was particularly enriched in CN1 (35 vs 1 sequences) (Supplementary Table S2). Besides the
supply of biosurfactants, the addition of different nitrate compounds may have stimulated
the growth of denitrifying Achromobacter spp. and P. stutzeri (Song and Ward, 2003) in the
bio-stimulated community, whereas depletion of nitrate may have favoured the presence of
the nitrogen-fixing Azospirillum species in the non-stimulated community. In addition to
its denitrifying capacity, P. stutzeri is well known as a naphthalene degrader that has
been isolated in such hydrocarbon enrichments from a wide range of environmental samples
(Rossello-Mora et al., 1994), and it was already an important component of the original
Nbs soil. However, the possibility that the selective desorption and liberation of
recalcitrant compounds from soils during the bio-stimulation process may have key
functions in shaping the genomes and community structures of the associated
microorganisms should not be ruled out. The findings of previous studies using BTEX
biodegradation co-cultures (Shim et al., 2005) agree with this hypothesis. In any case,
although bacteria related to Pseudomonas, Achromobacter
Figure 4 Comparison of bacterial phylotypes (cut-off of >97% sequence identity) based on the 16S rDNA sequences extracted from N,
Nbs, CN1 and CN2 community DNA. The percentages of bacterial phylogenetic lineages detected were based on OPUs. (a) Percentage of
16S rDNA sequences distributed by phyla. (b) Contribution of dominant groups among Gammaproteobacteria. (c) Relative distribution of
major bacterial phylotypes (based on 16S rDNA gene sequences (OPUs)) for soil communities CN1 and CN2 at the genus level.



Table 2 Statistical indexes for the four samples

                           N        Nbs      CN1      CN2
Taxa (OPUs)                51       53       13       11
Individuals (sequences)    212      261      90       107
Dominance_D                0.13     0.06     0.20     0.30
Shannon_H                  2.81     3.32     2.02     1.50
Good's_G                   0.76     0.80     0.86     0.90

Abbreviations: CN1, naphthalene-enriched community derived from N soil; CN2,
naphthalene-enriched community derived from Nbs soil; OPU, operational phylogenetic unit;
N, non-bio-stimulated polyaromatic hydrocarbon-contaminated soil; Nbs, bio-stimulated
polyaromatic hydrocarbon-contaminated soil.

PAST software v1.82b (Hammer et al., 2001) was used to calculate the statistical indices
for the bacterial sequences. The following formulas were used. Shannon-Wiener index:
$H = -\sum_i (n_i/n_t)\,\ln(n_i/n_t)$, where $n_i$ is the number of sequences of a
particular OTU and $n_t$ is the total number of sequences. Dominance_D:
$D = \sum_i (n_i/n_t)^2$. Good's coverage: $C = 1 - (n_i/n_t)$, where $n_i$ is the number
of OTUs observed exactly once and $n_t$ is the total number of sequences.



and Comamonas spp. have been widely associated with PAH degradation in a number of
contaminated ecosystems and microbial consortia (Goyal and Zylstra, 1997), the presence
of Azospirillum species in contaminated soils and PAH-degrading bacterial consortia has
previously been reported only in a heavily creosote-contaminated soil
(Vinas et al., 2005). This is of special interest because the recently sequenced
Azospirillum sp. B510 is known to harbour gene sets for degradation pathways, though
their functions have not yet been experimentally elucidated (Kaneko et al., 2010).

Owing to the clear shifts in community composition detected among the investigated
conditions, it is important to establish the roles and degradation capabilities of the
individual microbial members of the communities to understand the overall function of
each community and its members.



Genomic signatures associated with the aerobic degradation of aromatics

DNA isolated from each microbial community was sequenced using a Roche GS FLX DNA
sequencer at Lifesequencing SL (Valencia, Spain), which produced 1 471 821 reads
(N: 165 791; Nbs: 438 821; CN1: 448 713; CN2: 418 293) with an average length per read of
395 bp for CN1 and CN2, 336 bp for N and 531 bp for Nbs. Accordingly, a total of 55.7,
233.3, 177.2 and 165.2 Mbp of raw DNA sequences for N, Nbs, CN1 and CN2 were obtained,
which were assembled into 2.6 Mbp (4335 contigs), 17.9 Mbp (16 032 contigs), 20.0 Mbp
(20 809 contigs) and 13.0 Mbp (9915 contigs), respectively. CN1 (266) and CN2 (256)
contained a higher number of contigs with lengths greater than 10 kbp compared with Nbs
(68) and, to a greater extent, N (only 1); this may be due
Figure 5 Relative abundances and distributions of gene coding sequences in the CN1 and
CN2 communities based on taxonomic bins for bacterial 16S rDNA gene sequences (OPUs)
(blue), taxonomic bins for metagenome-derived genes encoding proteins that could be
assigned a taxonomic annotation (red) and taxonomic categories of proteins that were
identified in the metaproteomes (green). As shown, the metagenome and the 16S rDNA
analyses yielded congruent results. Further, only a minor number of bacterial members
seemed to be metabolically active in both communities, as judged by the expressed
proteins binned to them.



to the high diversity of the N sample, which resulted in a large pool of reads that could
not be assembled (Supplementary Figure S7). Further information regarding the obtained
metagenome characteristics is given in Supplementary Table S3. With a threshold of higher
than 95% identity and an aligned length of more than 150 bp, 91.7% (for N), 95.2%
(for Nbs), 92.3% (for CN1) and 89.9% (for CN2) of the predicted genes were assigned to
particular genera, the analysis of which showed results comparable to those found in the
16S rDNA assignments (see Figure 5 for CN1 and CN2 comparison). Approximately 89% of the
predicted genes (63 974 in total) were assigned to COG protein families, and 65% were
assigned to KEGG pathways (Supplementary Table S3). Among the 1538 (for N), 2751
(for Nbs), 2775 (for CN1) and 2507 (for CN2) KEGG orthologs detected, 891 were shared
between the two enrichment cultures, and 66% of these were
putatively attributed to Achromobacter species.
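
The identity and aligned-length thresholds described above can be expressed as a simple filter over similarity-search hits. The sketch below is only an illustration under stated assumptions: the `Hit` record, its field names, the example values and the best-hit rule (highest identity among the hits that pass both thresholds) are hypothetical, since the assignment procedure itself is not detailed in this section.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str            # predicted gene identifier (hypothetical)
    genus: str           # genus of the matched reference sequence
    identity: float      # percentage identity of the alignment
    aligned_length: int  # aligned length in bp


def assign_genera(hits, min_identity=95.0, min_length=150):
    """Assign each gene to the genus of its best hit passing both thresholds."""
    best = {}
    for hit in hits:
        # Both thresholds are strict: "higher than 95%" and "more than 150 bp".
        if hit.identity <= min_identity or hit.aligned_length <= min_length:
            continue
        current = best.get(hit.gene)
        if current is None or hit.identity > current.identity:
            best[hit.gene] = hit
    return {gene: hit.genus for gene, hit in best.items()}


hits = [
    Hit("gene_1", "Achromobacter", 98.2, 310),
    Hit("gene_1", "Pseudomonas", 96.0, 400),
    Hit("gene_2", "Azospirillum", 99.0, 120),  # rejected: alignment too short
    Hit("gene_3", "Pseudomonas", 94.9, 500),   # rejected: identity too low
]
assignments = assign_genera(hits)
print(assignments)
```

Genes without any passing hit are simply left unassigned, which corresponds to the fraction of predicted genes that could not be attributed to a genus.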

We first calculated the overrepresentation of
functions between metagenomes; for that, we
applied a z-test for independent proportions
proposed by Li (2009) to statistically analyse the
changes in the functional categories between samples.
It should be noted that the comparison was
restricted to samples Nbs, CN1 and CN2 because
the lower number of pyrosequences and COGs in
sample N may have an impact on the comparative
result. We found a rather stable distribution of the
functional category composition between the samples,
with significantly different contributions from only
238 out of 3293 COGs (Supplementary Table S4).
Among them, 56/23 showed a significant increase
in the bio-stimulated soil (Nbs) compared with the
CN1/CN2 enrichments, 58/26 increased in CN1
compared with Nbs/CN2 and 41/55 were
overrepresented in CN2 compared with Nbs/CN1.
Overall, we observed that both bio-stimulated
communities were particularly enriched in COGs related
to 'Replication and repair' and 'Translation' (29 vs 4
distinct COGs in total in CN1). Overrepresentation
of such functions is typical for microbial communities
developed under very dynamic environmental
conditions, such as bio-stimulation; the addition of
different nitrate compounds may have stimulated
competition between fast-growing organisms and
organisms capable of metabolising poly-aromatics.
By contrast, it is noteworthy that COGs related to
'Transcription' were most likely characteristic of
enrichment cultures (13 and 7 COGs enriched in
CN1 and CN2, respectively), whereas only two
(COG0789 and COG1974) were enriched in Nbs
(Supplementary Table S4). This suggests a plausible
scenario in which the stress caused by aromatic
compounds during the naphthalene cultivation led
to an enrichment of genomes containing transcriptional
elements that could be required for stress
endurance (Domínguez-Cuevas et al., 2006) and/or
activation of functions required for substrate uptake.
Ten distinct COGs within the 'Cell wall, membrane,
envelope biogenesis' category were found to be
overrepresented in CN1, a number much higher than those
found in Nbs (four COGs) and in CN2 (two COGs)
(Supplementary Table S4). In addition, CN1 was
also particularly enriched in 'ABC Transport Systems',
with genes distributed in 11 distinct
overrepresented COGs, compared with only three and two
in Nbs and CN2, respectively. COG0318
(acetyl-CoA synthetase) was also associated with the CN1
community; enzymes of COG0318 catalyse the
formation of acetyl-CoA from acetate, suggesting
that acetate may be a major end product used to
produce acetyl-CoA feeding the Krebs pathway in
members of CN1 (in agreement with the proteomic data
discussed below). By contrast, CN2 was
particularly enriched in genes coding for proteins
containing Fe-S clusters, namely COG3210 (large
exo-proteins involved in haem utilisation or adhesion),
COG1049 (aconitase B), COG0348 (polyferredoxin) as
well as COG4773 (outer membrane receptor for ferric
coprogen and ferric-rhodotorulic acid) (Supplementary
Table S4), indicating that iron is particularly relevant
for CN2 community members. Taken together, the
most notable observations drawn from all these data
sets are that bio-stimulated communities appear to
be extensive reservoirs of genes involved in replication,
repair and translation and iron adhesion
and utilisation, whereas bacterial components of
non-stimulated communities seem to be most
active in cell wall, membrane and envelope biogenesis;
in addition, enrichment most likely favoured
transcriptional events.
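
As an illustration of the kind of comparison described above, the following minimal sketch implements the textbook pooled z-test for two independent proportions. It assumes that the test cited as Li (2009) reduces to this standard form, which this section does not confirm; the function name and the counts used in the usage note are hypothetical and are not taken from Supplementary Table S4.

```python
import math


def two_proportion_z(hits_a, total_a, hits_b, total_b):
    """Pooled z-test for two independent proportions (textbook form).

    Assumption: this stands in for the test cited in the text; the exact
    formulation used in the study is not given in this section.
    Returns (z, two-sided p-value).
    """
    p_a = hits_a / total_a
    p_b = hits_b / total_b
    # Pooled proportion under the null hypothesis of equal proportions.
    pooled = (hits_a + hits_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / total_a + 1.0 / total_b))
    if se == 0.0:
        # No variation at all (all hits or no hits): nothing to test.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

For example, `two_proportion_z(120, 10000, 60, 10000)` compares a hypothetical COG found in 120 of 10 000 genes in one metagenome with 60 of 10 000 in another. In a screen over thousands of COGs, the resulting p-values would normally also be corrected for multiple testing.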

The obtained coverage was sufficient to produce a
further partial theoretical metabolic reconstruction
(including both exclusive and common capabilities) of
the aerobic aromatic catabolic routes in the investigated
communities; although these reconstructions were
incomplete and likely represent composite cell
networks, the information obtained may be sufficient
to achieve a better understanding of how the
four communities behave regarding their biodegradation
capabilities on a genomic scale. To this end, a
functional assignment of the predicted genes
was performed using an in-house database containing
protein sequences with biochemical functions
shown to be involved in biodegradation (for details
see Supplementary Materials and methods). According
to this protocol, 428 (N: 38; Nbs: 115; CN1: 132;
CN2: 143) open-reading-frame fragments were
identified as showing close sequence similarity to genes
that encode enzymes known to be involved in the
aerobic metabolism of aromatics via di- and
tri-hydroxylated intermediates and, accordingly,
presumptive functions were assigned. The overall
gene features and the presumptive catabolic routes
most likely associated with them are shown in
Supplementary Table S5 and Figure 6, and are
described in detail below. It should be
noted that the lower number of hits in the original
community (N) may be due to the broad spectrum of
carbon sources present in the soil, and may
have been biased by the lower number of assembled
sequences resulting from the high diversity and the
correspondingly low coverage of the metagenome in
this community.

Figure 6 Potential aerobic degradation networks of aromatics via di- and tri-hydroxylated intermediates in the four investigated communities.
The colour code used for the respective pathways is as follows: black, all samples; green, N and Nbs (specific for soil communities); blue, CN1
(N) (specific for non-bio-stimulated communities); red, CN2 (Nbs) (specific for bio-stimulated communities). Illustrations were created with the
ChemDraw Ultra 8.0 graphics programme (CambridgeSoft; http://www.cambridgesoft.com/) based on the substrate specificity of the
enzymes listed in Supplementary Table S5, and the corresponding metabolic pathways were established based on bibliographic records. It
should be noted that the identification of a particular activity in CN1 and CN2 implies its presence in N and Nbs, respectively, although the
corresponding signatures were not identified in the latter, most likely owing to their higher biodiversity and the low coverage of the metagenome.
Accordingly, those activities are indicated as CN1 (N) and CN2 (Nbs).

Eighteen/thirteen types of oxygenases (including 
the component systems reductase and ferredoxin 
and large and small subunits of the terminal 
oxygenase component) and 3/13 types of accessory 
enzymes involved in subsequent degradation or 
transport steps were identified in the N/Nbs soil 
samples, with the majority (~86% and 53%, 
respectively) being binned to strain(s) of the Pseu- 
domonas genus (Supplementary Table S5). Among 
the most significant enzymes were those involved in 
potential degradation pathways for naphthalene, 
biphenyl, phenol and carbazole (Figure 6). The 
absence of genomic signatures for carbazole degra- 
dation in the enriched communities further suggests 
that either the carbazole-degrading bacterium (or its 
plasmid) containing such genes was lost, or was 
present at a very low abundance, during the 
enrichment process. 

A total of 275 (CN1: 132; CN2: 143) open-reading-frame
fragments covering 186 (CN1: 83; CN2: 103)
unique proteins were identified in the enriched
cultures, of which 152 (CN1: 62; CN2: 90) were
characterised as oxygenase components, 43 (CN1:
11; CN2: 32) as accessory enzymes and 9 (CN1: 5;
CN2: 4) as probable regulators. The functional
analysis of these gene sets (Supplementary Table
S5) suggested that the microbial communities from
the enrichment samples and, in turn, their corresponding
soil samples possessed a number of common
metabolic degradation signatures (Figure 6),
although their phylogenetic compositions are
significantly different; however, a number of
community-specific genomic determinants were also found,
which may be of particular interest regarding the
possible energy sources available to community
members (Eaton, 1996).

We first obtained genomic evidence of the presence
of genes belonging to the naphthalene upper
pathway leading to the formation of salicylate
(Eaton and Chapman, 1992) in both enrichment
cultures, including naphthalene dioxygenases (CN1:
1; CN2: 2) and accessory enzymes (CN1: 6; CN2: 8)
(Supplementary Table S5 and Figure 6). Additionally,
the identification of salicylate-5-hydroxylases
(CN1: 5; CN2: 5) and salicylate-1-hydroxylases
(CN1: 0; CN2: 2) provided evidence that although
the salicylate-to-gentisate pathway operated in both
communities, the salicylate-to-catechol pathway
most likely occurs only in CN2. Both communities
contained genomic signatures suggesting their ability
to transform phenol to catechol (Supplementary
Table S5 and Figure 6). In CN2, a multicomponent
phenol hydroxylase (≥69% sequence identity to
LapKLMNOP of P. alkylphenolia) (Jeong et al., 2003)
was encoded in a single contig (contig01289) that
also contained enzymes involved in transforming
catechol to Krebs cycle intermediates (LapBCEHGIR,
an acetaldehyde dehydrogenase and a transporter);
in CN1, a phenol oxygenase component was
detected, although neither a clear indication of a
complete pathway nor a clear phylogenetic affiliation
could be established. The CN1 metagenome did
not contain genomic determinants associated with
the salicylate-to-catechol pathway, but did contain
genes encoding enzymes involved in the
ortho-cleavage of catechol (Supplementary Table S5);
this is not surprising, considering that
the catechol pathway is widespread among
Proteobacteria (Perez-Pantoja et al., 2008). Catechol is a
central metabolite in the biphenyl/benzoate degradation
pathways, and six benzoate dioxygenases
(CN1: 2; CN2: 4), four 2,3-dihydroxybiphenyl
dioxygenases (CN1: 1; CN2: 3) and two benzoate
dihydrodiol dehydrogenases (CN1: 0; CN2: 2) were
identified in the two communities (Supplementary
Table S5). Other common degradation capabilities
observed in both communities included the potential
to degrade 4-hydroxyphenylacetate via
homoprotocatechuate, which may be transformed by
homoprotocatechuate 2,3-dioxygenases (CN1: 2;
CN2: 2); furthermore, 4-hydroxyphenylpyruvate
may be transformed by 4-hydroxyphenylpyruvate
dioxygenases (CN1: 4; CN2: 5) to homogentisate,
which can be cleaved by homogentisate
1,2-dioxygenases (CN1: 5; CN2: 3) (Supplementary Table S5
and Figure 6). The identification of a 2-amino-1,6-dioxygenase
in both communities suggests that
aminophenol may also be a potential common
carbon source. It was recently reported that nicotinic
acid, an N-heterocyclic carboxylic derivative of
pyridine that is essential for microorganisms, can
also be used as a carbon and nitrogen source
(Jimenez et al., 2008); this substrate enters the Krebs
cycle via the action of a number of enzymes, and
degradation is initiated by a 6-hydroxynicotinate
3-monooxygenase (Jimenez et al., 2008). Accordingly,
two putative 6-hydroxynicotinate 3-monooxygenases
(CN1: 1; CN2: 1) were identified
(Supplementary Table S5). Further genomic evidence
suggested the presence of metabolic routes
associated with vanillate (vanillate monooxygenase;
CN1: 2; CN2: 2) and p-hydroxybenzoate
(4-hydroxybenzoate 3-hydroxylase; CN1: 3; CN2: 2)
(Supplementary Table S5 and Figure 6).
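
The reasoning used above for salicylate (a route is considered possible in a community only if at least one gene for its marker enzyme was found) can be sketched as follows. The gene counts are those quoted in the paragraph; the one-marker-per-route table and the function name are simplifying assumptions made for illustration, not the annotation protocol of the study.

```python
# Gene counts quoted in the text for the two enrichment cultures.
COUNTS = {
    "CN1": {"salicylate 5-hydroxylase": 5, "salicylate 1-hydroxylase": 0},
    "CN2": {"salicylate 5-hydroxylase": 5, "salicylate 1-hydroxylase": 2},
}

# Assumed simplification: one marker enzyme per route.
ROUTES = {
    "salicylate 5-hydroxylase": "salicylate -> gentisate",
    "salicylate 1-hydroxylase": "salicylate -> catechol",
}


def predicted_routes(sample_counts):
    """Return the routes whose marker enzyme has at least one gene hit."""
    return sorted(
        ROUTES[enzyme] for enzyme, n in sample_counts.items() if n > 0
    )


for sample, sample_counts in COUNTS.items():
    print(sample, predicted_routes(sample_counts))
```

Presence of a marker gene only indicates genetic potential; whether the route is actually used is a question for the metaproteome data.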

Genomic evidence of the potential to
transform phthalate (four phthalate dioxygenases)
and terephthalate (one terephthalate dioxygenase)
into protocatechuate was observed only in CN1
(Supplementary Table S5 and Figure 6). Other
potential growth-supporting aromatic compounds
found to be CN1 specific were gallate (through a
gallate dioxygenase), 2,3-dihydroxyphenylpropionate
(through a 2,3-dihydroxyphenylpropionate
dioxygenase), 2,3-dihydroxy-p-cumate (through a
2,3-dihydroxy-p-cumate dioxygenase), indole carboxylate,
polycyclic arene diols (through a PhnC-like polycyclic
arene diol extradiol dioxygenase) and
diterpenoids, such as abietic acid and its
intermediate 7-oxo-11,12-dihydroxy-8,13-abietadiene acid
(through an abietane diterpenoid dioxygenase and
the corresponding 7-oxo-11,12-dihydroxydehydroabietate
dioxygenase, which has an unclear
phylogenetic affiliation) (Supplementary Table S5 and
Figure 6). By contrast, genomic signatures for
the potential utilisation of ibuprofen, through an
ibuprofen-CoA dioxygenase, were found only in the
bio-stimulated-derived community (Supplementary
Table S5).

The functional phylogenetic assignments further 
indicated potential major 'food' webs that were 
primarily composed of Pseudomonas gammaproteo- 
bacterial species (most likely operating in both soils 
and CN2 communities, but not in CN1), Azospirillum
alphaproteobacterial species (operating in
non-stimulated communities) and betaproteobacter- 
ial Achromobacter (operating in all communities) 
and Comamonas (operating in non-stimulated com- 
munities) species. Based on the phylogenetic ana- 
lysis provided in Supplementary Table S5, different 
interactions among these members can be hypothe- 
sised. Achromobacter spp. may supply the genetic 
potential for transforming naphthalene to salicylate, 
benzoate/biphenyl to catechol, 4-hydroxybenzoate, 
4-hydroxyphenylpyruvate, ibuprofen, vanillate, 
homogentisate, (homo-) protocatechuate and 
cumate. Potential activities of Azospirillum may be 
responsible for the ability to transform salicylate to 
gentisate, 4-hydroxybenzoate and 4-hydroxyphenyl- 
pyruvate. The potential roles of Pseudomonas may 
cover the abilities to transform carbazole to catechol, 
naphthalene to salicylate to gentisate and catechol, 
benzoate/biphenyl/phenol to catechol, 4-hydroxy- 
benzoate, 4-hydroxyphenylpyruvate, vanillate and 
protocatechuate. The potential activities of Coma- 
monas may include the transformation of 4-hydro- 
xybenzoate, 4-hydroxyphenylpyruvate, vanillate, 
gallate, nicotinate, 2,3-dihydroxyphenylpropionate 
and (tere-) phthalate. 
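
The hypothesised division of labour can be made easier to inspect by treating each genus as a set of substrates and using set operations. The sets below are a simplified reading of the lists in the paragraph above (for example, 'benzoate/biphenyl' is split into two entries and pathway end products are ignored); this encoding and the helper names are assumptions made for illustration.

```python
# Substrates attributed to each genus, as read from the text (simplified).
POTENTIAL = {
    "Achromobacter": {
        "naphthalene", "benzoate", "biphenyl", "4-hydroxybenzoate",
        "4-hydroxyphenylpyruvate", "ibuprofen", "vanillate",
        "homogentisate", "protocatechuate", "homoprotocatechuate", "cumate",
    },
    "Azospirillum": {
        "salicylate", "4-hydroxybenzoate", "4-hydroxyphenylpyruvate",
    },
    "Pseudomonas": {
        "carbazole", "naphthalene", "salicylate", "benzoate", "biphenyl",
        "phenol", "4-hydroxybenzoate", "4-hydroxyphenylpyruvate",
        "vanillate", "protocatechuate",
    },
    "Comamonas": {
        "4-hydroxybenzoate", "4-hydroxyphenylpyruvate", "vanillate",
        "gallate", "nicotinate", "2,3-dihydroxyphenylpropionate",
        "phthalate", "terephthalate",
    },
}


def shared_by_all(potential):
    """Substrates listed for every genus."""
    return set.intersection(*potential.values())


def unique_to(genus, potential):
    """Substrates listed for `genus` and for no other genus."""
    others = set().union(
        *(subs for name, subs in potential.items() if name != genus)
    )
    return potential[genus] - others


print(sorted(shared_by_all(POTENTIAL)))
print(sorted(unique_to("Pseudomonas", POTENTIAL)))
```

Under this reading, substrates shared by all genera point to possible competition, whereas genus-specific substrates point to possible complementary roles.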

Taken together, the data may have significant
ecological implications: while it is plausible that
bio-stimulation per se favours 'specialists' for
naphthalene degradation, non-stimulated communities may
display a higher metabolic plasticity to access
pollutants with complex chemical structures and
physical properties. This pattern was evident
in that bio-stimulated community members
presented two different routes for salicylate
utilisation, in contrast to the non-stimulated ones,
which were only capable of metabolising naphthalene
via gentisate but have the genetic potential to
degrade a range of carboxylated aromatics (Figure 6)
for which no genomic signatures were found in the
bio-stimulated communities. This should be of
special interest in defining future strategies for
implementing enrichment cultures in settings associated
with biodegradation based on the pollutant
profile, and thus producing à la carte degrading
communities.



Proteomic blueprints of CN1 and CN2 naphthalene-enriched communities

A total of 1116 proteins were unambiguously 
identified and quantified from cells harvested from 
enrichment cultures CNl and CN2 (60 transfers) 
used also for DNA isolation, following the protocol 
and criteria described in Supplementary Materials 
and Methods (Supplementary Table S6). Because 
the analysis was performed using a draft metagen- 
ome, the metaproteome size is within common 
ranges that have been observed for other commu- 
nities and is only three times lower than that 
observed for cultivable organisms (Benndorf et al.,
2007; Passalacqua et al., 2009; Abu Laban et al.,
2010; Selesi et al., 2010). As the majority of the
spectra could be assigned to a taxonomic and 
functional annotation based on highly similar 
homologues, the metaproteomic approach applied 
here allowed us to compare taxonomic annotations 
to evaluate the differences between the contribu- 
tions of particular groups of organisms to the global 
community, as well as to predict the importance of 
particular sets of proteins for the overall functioning 
of the community. 

A total of 582 (or 52%) proteins were identified in
both samples, with the relative intensities of 123
and 227 proteins being significantly higher in CN1
and CN2, respectively (Supplementary Table S6).
Additionally, a total of 132 proteins were specific
to CN1, most of which were affiliated with
Azospirillum (~71%) and Comamonas (~17%)
species, whereas 402 proteins (~94% associated
with Gammaproteobacteria of the Pseudomonas
genus) were found exclusively in CN2. This clearly
indicated that the CN1 and CN2 communities
displayed considerable heterogeneity, in accordance
with their distinct phylogenetic compositions
(Figures 4 and 5). On the basis of their probable
functions, the proteins were sorted into five major
groups, which were associated with roles in aerobic
degradation, electron acceptors, energy metabolism,
transport and folding/stress.
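
The protein counts quoted in the paragraph above can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers stated in the text (1116 proteins in total, 582 shared, 132 CN1-specific, 402 CN2-specific, and 123 and 227 shared proteins with higher intensity in CN1 and CN2, respectively); the function and key names are hypothetical.

```python
def summarise_protein_sets(total, shared, cn1_only, cn2_only,
                           higher_cn1, higher_cn2):
    """Cross-check the quoted counts and derive simple shares."""
    # Shared and sample-specific proteins should partition the full set.
    if shared + cn1_only + cn2_only != total:
        raise ValueError("shared and sample-specific counts do not add up")
    return {
        "shared_percent": 100.0 * shared / total,
        "cn1_specific_percent": 100.0 * cn1_only / total,
        "cn2_specific_percent": 100.0 * cn2_only / total,
        # Shared proteins without a significant intensity difference.
        "shared_unchanged": shared - higher_cn1 - higher_cn2,
    }


summary = summarise_protein_sets(1116, 582, 132, 402, 123, 227)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```

The three groups add up to the 1116 identified proteins, and the shared fraction rounds to the 52% given in the text.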

(i) Aerobic degradation proteins: As expected for
communities utilising aromatics as their sole
carbon source, peptides from 27 unique proteins
directly assigned to different aromatic degradation
pathways were found in the metaproteomes, albeit
at different intensities (Supplementary Table S7).
These included 19 proteins belonging to the
naphthalene degradation pathway with gentisate as
an intermediate, with abundance levels up to a
relative concentration (rel. conc.) of 2.1%. Seventeen
proteins were observed at higher abundances in
CN1 compared with CN2, whereas two proteins
were CN1- and CN2-specific. Additionally, four
proteins involved in the catechol meta-cleavage
pathway were found to be most highly expressed in CN2,
with remarkable expression levels of up to 0.53%
rel. conc. Expression of a glutathione S-transferase
(NagJ-like) and a membrane protein of unknown
function was observed, and these may be implicated
in naphthalene degradation. It should be noted that
the most abundant protein in CN1, corresponding to the
naphthalene dioxygenase alpha and beta subunits,
showed a relative concentration of ~3.6%
(Supplementary Table S6); therefore, proteins
involved in naphthalene degradation were among
those that were most highly expressed (for example, CN1/
23999 gentisate dioxygenase, with a 2.0% rel. conc.),
as expected for a community utilising naphthalene as
the only carbon source. Similarly, in the case of CN2,
the most abundant protein was a salicylaldehyde
dehydrogenase (1.7% rel. conc.) (Supplementary
Table S6).

Phylogenetic analysis suggests that in CN1 the
primary active proteins are identical to
those described in the pWWU2 plasmid of Ralstonia
sp. U2, and were most likely binned to several species or
genomes of Achromobacter (Supplementary Table
S7). In contrast, the degradation of naphthalene in
CN2 may be 'presumptively' carried out both by
organisms that utilise a catechol pathway similar to
the pathway encoded by the plasmid pND6-1 of
Pseudomonas sp. strain ND6 and by those that employ
gentisate pathways; however, in the latter case,
at least two different types of organisms may be
involved, as proteins similar (similarity ranging
from 57 to 100%) to those encoded by the pWWU2
and pND6-1 plasmids from Ralstonia sp. strain U2
(in CN1 and CN2) and Pseudomonas sp. strain ND6
(in CN2), respectively, were found (Supplementary
Tables S6 and S7). Additional organisms may have
minor roles in the naphthalene degradation network
in both cultures, as indicated by the expression
of CN1/3872 gentisate dioxygenase, which has a
sequence identity of only 53% to that of Ralstonia
sp. U2 (Supplementary Table S7). In any case, it
should be noted that among a total of 62 unique
proteins taking part in the putative naphthalene
degradation network based on the obtained
metagenome sequence data (Supplementary Table S5),
only approximately 43% of the proteins passed
the strict ≥2 unique peptides filtering criterion
(Supplementary Table S7). As the physico-chemical
features of peptides that define their suitability for
detection by mass spectrometry are more or less
equally distributed, the likelihood of detecting a
protein is mostly a matter of abundance. Thus,
the set of label-free detected proteins is by itself
indicative of the abundance of both the producing
species and the protein of interest.
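
The '≥2 unique peptides' criterion mentioned above can be sketched as a simple filter. The sketch assumes that a 'unique' peptide is one whose sequence maps to a single protein in the data set; the exact definition applied in the study is given in its Supplementary Materials and Methods and may differ. The protein identifiers and peptide sequences below are hypothetical.

```python
from collections import defaultdict


def filter_by_unique_peptides(hits, min_unique=2):
    """Keep proteins supported by at least `min_unique` unique peptides.

    `hits` maps a protein ID to the list of peptide sequences matched to it.
    Assumption: a peptide is 'unique' if it maps to exactly one protein.
    """
    # Record which proteins each peptide sequence was matched to.
    owners = defaultdict(set)
    for protein, peptides in hits.items():
        for peptide in peptides:
            owners[peptide].add(protein)

    kept = []
    for protein, peptides in hits.items():
        # Count distinct peptides that belong to this protein only.
        n_unique = sum(1 for pep in set(peptides) if len(owners[pep]) == 1)
        if n_unique >= min_unique:
            kept.append(protein)
    return sorted(kept)


# Hypothetical identifications for illustration only.
example_hits = {
    "protein_A": ["LVNEK", "GTDAR", "LVNEK"],  # two distinct unique peptides
    "protein_B": ["AAGK", "EIGHK"],            # one unique, one shared
    "protein_C": ["MSTR", "QELK", "EIGHK"],    # two unique, one shared
}
print(filter_by_unique_peptides(example_hits))
```

Repeated observations of the same peptide are counted once, and peptides shared between proteins do not count towards the threshold.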

(ii) Proteins involved in electron acceptors: A total
of seven genes (covered by five distinct orthologs:
K00370, K00371, K00373, K00374 and K00376)
involved in nitrate (NO₃⁻) and nitric/nitrous oxide
(NₓO) metabolism were found among the genes that
were exclusive to the CN2 metagenome, the majority
(~90%) of which were associated with a single
(or several) Pseudomonas species or genome(s)
(most likely P. stutzeri). Consistent with this finding,
significant expression of a nitric-oxide reductase
(CN2/16494; 0.25% rel. conc.), a nitrous-oxide
reductase (CN2/13702; 0.30% rel. conc.) and four
nitrate reductases (CN2/1030, CN2/16257, CN2/
15634 and CN2/1029; from 0.01% to 0.20% rel.
conc.) was found in the CN2 metaproteome
(Supplementary Table S6). The presence of such
enzymes suggests that NₓO may be used as an
alternative (or dominant) electron acceptor (Hemme
et al., 2010) by members of the CN2 community,
most likely by Pseudomonas stutzeri strain(s) that
are capable of denitrification (Chen et al., 2011).

(iii) Proteins involved in energy metabolism:
Enzymes belonging to the naphthalene degradation
pathway convert naphthalene to pyruvate and
fumarate, which feed the Krebs cycle. A total of 72
central metabolic enzymes were found in both the
CN1 and CN2 metaproteomes during growth on
naphthalene, including 41 proteins involved in the
Krebs and glyoxylate cycles (Supplementary Table
S6), which were binned predominantly to several species or
genomes of Achromobacter, Comamonas and
Azospirillum in CN1 and of Achromobacter and
Pseudomonas in CN2. The enzyme diversity and taxonomic
classification of the expressed proteins suggested
that during growth with naphthalene, acetyl-CoA
was metabolised in CN1 and CN2 via both the Krebs
and the glyoxylate cycles. However, it is noteworthy
that the expression of citrate synthase (for example,
CN2/0977; 0.37% rel. conc.) in CN2 was
approximately threefold higher than the expression of
malate synthase (for example, CN2/15317; 0.11%
rel. conc.), whereas in CN1 the opposite scenario
was observed, as malate synthase (for example,
CN1/16907; 0.07% rel. conc.) was expressed at a
higher level than citrate synthase (0.0096% rel.
conc.) (Supplementary Table S6). This suggests that
acetyl-CoA may be preferentially metabolised via
the Krebs cycle in CN2, whereas it may be
preferentially metabolised via the glyoxylate cycle
in CN1. Kurbatov et al. (2006) observed that during
the growth of P. putida KT2440 on phenol, the
glyoxylate cycle was preferred, suggesting that
bio-stimulation prior to the enrichment may also have a
direct influence on the overall preferred mode of
energetic metabolism in settings associated with
biodegradation.
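
The comparison made above between citrate synthase and malate synthase can be reproduced from the relative concentrations quoted in the text. The sketch below uses the ratio of the two representative proteins as a crude proxy for the preferred cycle; this threshold-at-one rule and the names used are assumptions made for illustration, not the analysis performed in the study.

```python
# Relative concentrations (% rel. conc.) quoted in the text for
# representative citrate synthase and malate synthase proteins.
REL_CONC = {
    "CN1": {"citrate synthase": 0.0096, "malate synthase": 0.07},
    "CN2": {"citrate synthase": 0.37, "malate synthase": 0.11},
}


def citrate_to_malate_ratio(sample):
    """Ratio of citrate synthase to malate synthase abundance."""
    values = REL_CONC[sample]
    return values["citrate synthase"] / values["malate synthase"]


def preferred_cycle(sample):
    """Crude proxy: a ratio above 1 is read as a Krebs-cycle preference."""
    if citrate_to_malate_ratio(sample) > 1.0:
        return "Krebs cycle"
    return "glyoxylate cycle"


for sample in ("CN1", "CN2"):
    ratio = citrate_to_malate_ratio(sample)
    print(sample, round(ratio, 2), preferred_cycle(sample))
```

Because only one representative protein per enzyme is quoted in the text, the ratios indicate a tendency rather than a measured flux.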

Binning analysis suggests Pseudomonas species 
(in CN2) most likely use both pyruvate and fumarate 
(generated from naphthalene) as substrates to 
feed the Krebs and glyoxylate pathways, whereas 
members of CNl primarily use pyruvate. This can be 
seen clearly in the abundances of fumarate hydra- 
tases and succinate dehydrogenases, which were 
observed at significant levels only in the CN2 
proteome (seven proteins in total, all binned to 
the Pseudomonas genus: CN2/0973, CN2/0975,
CN2/0974, CN2/7221, CN2/17464, CN2/17714 and
CN2/17817, showing up to a 0.46% rel. conc.)
(Supplementary Table S6).

The ISME Journal
Microbial response to polyaromatics
M-E Guazzaroni et al

The metaproteomic analysis revealed the presence
of five acetyl-CoA synthases (three binned to
Azospirillum in CN1, and one to Achromobacter and one to
Pseudomonas in CN2; up to a 0.51% rel. conc.;
Supplementary Table S6), which suggests that acetate,
in combination with pyruvate (produced from
naphthalene), is a major end product used to
produce acetyl-CoA feeding the Krebs and
glyoxylate pathways. In CN1, acetyl-CoA may additionally
be produced from acetate via the action of an
Azospirillum-derived formate C-acetyltransferase
(CN1/4225), which was detected at a low expression
level (0.02% rel. conc.) (Supplementary Table S6);
moreover, in CN1, acetate may be used as an energy
source via an acetate kinase (0.04% rel. conc.),
which uses ATP to produce acetyl phosphate.
Taking these findings together, we suggest that the
production and utilisation of acetate, yielding maximal
energy while avoiding acidification of the
environment owing to the accumulation of acetate,
is potentially more efficient in CN1. Together with
the previous observations related to the Krebs and
glyoxylate cycles, these results suggest that distinct
energetic networks operate in the two communities
during naphthalene utilisation.

(iv) Proteins involved in transport: Both enrichment
cultures contained a large number of cytoplasmic
membrane transport proteins, which may facilitate the
uptake of a multitude of organic compounds: strong
production of 69 (up to a 2.28% rel. conc.) and 24
(up to a 1.55% rel. conc.) proteins associated with
(in)organic molecule uptake systems (including
transporters, TolB proteins and porins) was evident in the
CN1 and CN2 metaproteomes, respectively
(Supplementary Table S6). The large number of significantly
expressed transporters in CN1 is in agreement with
the profiling analysis based on the COG-enrichment
values calculated for each microbiome
(Supplementary Table S4), which showed an
overrepresentation of COGs classified into the 'ABC transport
systems' in the non-stimulated enriched community.

(v) Proteins involved in folding and stress:
Proteins involved in detoxification and oxidative
stress response mechanisms were identified only in
CN1 (22 hits) (Supplementary Table S6). Interestingly,
all of these proteins were binned to
Azospirillum, which may suggest a higher level of radical
stress on these organisms compared with others
within the community. Among the stress proteins
detected, one peroxidase and one catalase (CN1/
2218 and CN1/18517) and two putative xenobiotic
reductases (CN1/14685 and CN1/16278) were
identified, with relative abundance levels ranging from
0.09 to 0.19%. Expression of these types of stress
proteins has been reported in several studies related
to the responses of organisms to different sources of
carbon and energy, for example, as a response to
phenol (Kurbatov et al., 2006). However, to the best
of our knowledge, no experimental (for example,
proteomic) evidence suggesting higher stress
responses of Azospirillum strains compared with
other degraders in contaminated or aromatic-enriched
cultures has been obtained. In contrast,
high abundances (21 hits) of Pseudomonas- and
Achromobacter-binned stress proteins potentially
involved in protein folding mechanisms were
identified in CN2, which included CspA, Clp, GroEL,
DnaJ and thiol-disulphide interchange-like proteins
(Supplementary Table S6). Taken together, these data
suggest the action of distinct coping mechanisms in
response to naphthalene-derived stress during the
enrichment process. Members of CN1 appear
to be more affected by reactive oxygen species, most
likely produced during naphthalene degradation
(for example, via enzyme inefficiencies and/or
dioxygenase inhibition), whereas stress factors
derived from naphthalene itself appear to be the
major determinants affecting individual members,
and thus modulating the structure, of the CN2
community.

Altogether, the present study provided a
comprehensive and comparative investigation of four
microbial communities containing several OPUs
associated with putative metabolism related to
degradation capabilities. Their inter-alia interactions
were unravelled, which further demonstrated
that a minor fraction of active genomes (and the
proteins they encode) is responsible for the degradation
of naphthalene both in soil samples and
enrichment cultures (Figure 5). It is worth
mentioning, with respect to the aspects discussed here,
that the serial batch enrichment cultures used for
CN1 and CN2 could select for fast-growing
species, which could also have some impact on the
functional phylogenetic structure found in this
investigation. Whatever the case, the present study
revealed that the soil ecosystem (that is, a
PAH-contaminated soil) investigated here was quite
delicate and that its metabolic capability may be
easily lost as a result of bio-stimulation, which
could have significant ecological consequences. Finally,
the curated database containing protein sequences
with biochemical functions shown to be involved
in the biodegradation of aromatics via di- and
tri-hydroxylated intermediates, which was created
and applied for the first time in this study, could be
used in future investigations of microbial
biodiversity, ecology and function in response to
aromatics.
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